Towards an Open Service Architecture for Data Mining on the Grid
Abstract
Across a wide variety of fields, huge datasets are being collected and accumulated at a dramatic pace. The datasets addressed by individual applications are very often heterogeneous and geographically distributed, and are used collaboratively by user communities that are often large and also geographically distributed. There are major challenges involved in the efficient and reliable storage and fast processing of this great mass of data, and in extracting descriptive and predictive knowledge from it. In this paper, we describe the design principles and the service-based software architecture of a novel infrastructure for distributed and high-performance data mining in Computational Grid environments. This architecture has been designed and is being implemented on top of the Globus 3.0 Alpha toolkit (which provides basic Grid services, such as authentication, information and resource management, etc.) and OGSA-DAI Grid Services (which provide basic access to Grid databases).
Similar resources
WS-DAI-DM: An Interface Specification for Data Mining in Grid Environments
Providing appropriate means of access to data mining services in a Grid environment is essential for combining Grid and data mining. The transition from the centralized data mining processes found in traditional tools to Grid-compliant and Grid-based data mining services that can coordinate with each other is important to extract useful and potential knowledge/patterns from distributed dat...
Biosimgrid: a Distributed Database for Biomolecular Simulations
Biomolecular simulations provide data on the conformational dynamics and energetics of complex biomolecular systems. We aim to exploit the e-science infrastructure developing in the UK to enable large scale analysis of the results of such simulations. In particular, the BioSimGrid project (www.biosimgrid.org) will provide a generic database for comparative analysis of simulations of biomolecule...
GridMiner: An Infrastructure for Data Mining on Computational Grids
Knowledge discovery in datasets integrated into Grids is a challenging research task. These large datasets are being collected and accumulated across a wide variety of fields at a dramatic pace. They are often heterogeneous and geographically distributed, and are used globally by large user communities. There are major challenges involved in the efficient and reliable storage, fast processing, in...
Architectural Plan for Constructing Fault Tolerable Workflow Engines Based on Grid Service
This paper presents the design and implementation of a fault-tolerant architecture for scientific workflow engines. The engines are assumed to be implemented as composite web services. Current architectures for workflow engines make no provision for substituting faulty web services with correct ones at run time. The difficulty is to roll back the execution state of the workflo...
Development of a framework to evaluate service-oriented architecture governance using COBIT approach
Nowadays, organizations require an effective governance framework for their service-oriented architecture (SOA) so that they can evaluate the current state of their governance, determine their governance requirements, and then arrive at a suitable governance model. Various frameworks have been developed to evaluate SOA governance. In this paper, a brief introd...
Publication date: 2003